interaction screening
Reviews: Interaction Screening: Efficient and Sample-Optimal Learning of Ising Models
The main effort is to improve upon the recent results provided by Bresler, showing that the complexity of identifying the structure of max degree d Ising model is polynomial in p and independent of d. Strong Points: 1) The timeliness of the topic in this paper is good, meaning that there is currently ongoing interest and work on Ising model reconstruction. Weak points: 1) The whole approach is based on the introduction of the ISO. This is the main trick in the proposed approach. Other than that, the rest of the method and its analysis are usual and well studied (l_1-penalization and connection with the tutorial by Negahban et.
META-ANOVA: Screening interactions for interpretable machine learning
Choi, Yongchan, Park, Seokhun, Park, Chanmoo, Kim, Dongha, Kim, Yongdai
There are two things to be considered when we evaluate predictive models. One is prediction accuracy,and the other is interpretability. Over the recent decades, many prediction models of high performance, such as ensemble-based models and deep neural networks, have been developed. However, these models are often too complex, making it difficult to intuitively interpret their predictions. This complexity in interpretation limits their use in many real-world fields that require accountability, such as medicine, finance, and college admissions. In this study, we develop a novel method called Meta-ANOVA to provide an interpretable model for any given prediction model. The basic idea of Meta-ANOVA is to transform a given black-box prediction model to the functional ANOVA model. A novel technical contribution of Meta-ANOVA is a procedure of screening out unnecessary interaction before transforming a given black-box model to the functional ANOVA model. This screening procedure allows the inclusion of higher order interactions in the transformed functional ANOVA model without computational difficulties. We prove that the screening procedure is asymptotically consistent. Through various experiments with synthetic and real-world datasets, we empirically demonstrate the superiority of Meta-ANOVA
Interaction Screening: Efficient and Sample-Optimal Learning of Ising Models
Vuffray, Marc, Misra, Sidhant, Lokhov, Andrey, Chertkov, Michael
We consider the problem of learning the underlying graph of an unknown Ising model on p spins from a collection of i.i.d. We suggest a new estimator that is computationally efficient and requires a number of samples that is near-optimal with respect to previously established information theoretic lower-bound. Our statistical estimator has a physical interpretation in terms of "interaction screening". The estimator is consistent and is efficiently implemented using convex optimization. We prove that with appropriate regularization, the estimator recovers the underlying graph using a number of samples that is logarithmic in the system size p and exponential in the maximum coupling-intensity and maximum node-degree.
Statistical mechanics of the inverse Ising problem and the optimal objective function
Institute for Theoretical Physics, University of Cologne, Z ulpicher Straรe 77, 50937 Cologne, Germany The inverse Ising problem seeks to reconstruct the parameters of an Ising Hamiltonian on the basis of spin configurations sampled from the Boltzmann measure. Over the last decade, many applications of the inverse Ising problem have arisen, driven by the advent of large-scale data across different scientific disciplines. Recently, strategies to solve the inverse Ising problem based on convex optimisation have proven to be very successful. Examples are the pseudolikelihood method and interaction screening. In this paper, we establish a link between approaches to the inverse Ising problem based on convex optimisation and the statistical physics of disordered systems. We characterise the performance of an arbitrary objective function and calculate the objective function which optimally reconstructs the model parameters. We evaluate the optimal objective function within a replica-symmetric ansatz and compare the results of the optimal objective function with other reconstruction methods. Apart from giving a theoretical underpinning to solving the inverse Ising problem by convex optimisation, the optimal objective function outperforms state-of-the-art methods, albeit by a small margin. The advent of large-scale data across different scientific disciplines, especially biology, has inspired many applications of the inverse Ising problem. Over the last decade, the inverse Ising problem has been used to analyze neural firing patterns [1] and gene expression data [2], to infer biological fitness landscapes [3, 4], and to analyze financial data [5].
Interaction pursuit in high-dimensional multi-response regression via distance correlation
Kong, Yinfei, Li, Daoji, Fan, Yingying, Lv, Jinchi
Feature interactions can contribute to a large proportion of variation in many prediction models. In the era of big data, the coexistence of high dimensionality in both responses and covariates poses unprecedented challenges in identifying important interactions. In this paper, we suggest a two-stage interaction identification method, called the interaction pursuit via distance correlation (IPDC), in the setting of high-dimensional multi-response interaction models that exploits feature screening applied to transformed variables with distance correlation followed by feature selection. Such a procedure is computationally efficient, generally applicable beyond the heredity assumption, and effective even when the number of responses diverges with the sample size. Under mild regularity conditions, we show that this method enjoys nice theoretical properties including the sure screening property, support union recovery, and oracle inequalities in prediction and estimation for both interactions and main effects. The advantages of our method are supported by several simulation studies and real data analysis.
Innovated interaction screening for high-dimensional nonlinear classification
Fan, Yingying, Kong, Yinfei, Li, Daoji, Zheng, Zemin
This paper is concerned with the problems of interaction screening and nonlinear classification in a high-dimensional setting. We propose a two-step procedure, IIS-SQDA, where in the first step an innovated interaction screening (IIS) approach based on transforming the original $p$-dimensional feature vector is proposed, and in the second step a sparse quadratic discriminant analysis (SQDA) is proposed for further selecting important interactions and main effects and simultaneously conducting classification. Our IIS approach screens important interactions by examining only $p$ features instead of all two-way interactions of order $O(p^2)$. Our theory shows that the proposed method enjoys sure screening property in interaction selection in the high-dimensional setting of $p$ growing exponentially with the sample size. In the selection and classification step, we establish a sparse inequality on the estimated coefficient vector for QDA and prove that the classification error of our procedure can be upper-bounded by the oracle classification error plus some smaller order term. Extensive simulation studies and real data analysis show that our proposal compares favorably with existing methods in interaction selection and high-dimensional classification.